Three instruments used to evaluate faculty were
compared: the Student Instructional Report (SIR), produced by the
Educational Testing Service, the College Instructional Evaluation
Questionnaire (CIEQ), produced by the University of Arizona, and the
Instructional Development and Effectiveness Assessment (IDEA),
produced by Kansas State University. Information is provided on
student and faculty preferences, correlations among instruments and
scale scores, and content. The three instruments were administered to
426 students in 16 selected classes at Sam Houston State University.
SIR appeared to measure the differential components of teaching with
more clarity and was preferred by students and faculty over the other
two instruments. It also had the greatest amount of feedback
available. CIEQ was simpler to read, was shorter, and had fewer
categories. IDEA, which seemed the most complex of the three, had
many categories and much feedback, but was designed more for faculty
development purposes than were the other two. In terms of cost, SIR
was the most and CIEQ the least expensive. Since a high degree of
correlation was found among the instruments, a single general factor
underlying student ratings of instruction seemed to exist. Brief
descriptions of each instrument are included. (SK)

A COMPARISON OF THREE TEACHING EVALUATION INSTRUMENTS

A. JERRY BRUCE 

The evaluation of instruction/faculty by students has received
much attention in the past few years. However, little attention
has been paid to the comparison of the various rating instruments
available. The present paper attempts to compare three of the
most widely used instruments: The Student Instructional Report
(ETS), The College Instructional Evaluation Questionnaire
(University of Arizona), and The Instructional Development and
Effectiveness Assessment (Kansas State University). Student
preferences, faculty preferences,
correlations among instruments and scale scores, and analyses of 
content are reported. 

A COMPARISON OF THREE TEACHING EVALUATION INSTRUMENTS

A. JERRY BRUCE
SAM HOUSTON STATE UNIVERSITY
(Paper from the Thirty-first Annual Convention
of the Southwestern Psychological Association,
April 20, 1985)

One of the most trying problems facing college and
university administrators is that of faculty evaluation. As
merit pay and other performance-based approaches to faculty
salaries and other reward systems are promoted, the problem
of evaluation becomes even more critical. Along with
increased evaluation, the role of academic evaluator becomes
more visible. Most administrators are eager to have
available more objective means for decision making in this
realm.

Faculty perform a variety of functions within the
university and college community: they do research; they
perform community service; they serve on various college
and university committees; and they teach. The most
difficult function to measure is teaching. One university
administrator was once overheard addressing his faculty on
this subject, saying, "Teaching is by far the most important
task you perform at the university, but we cannot measure
it; therefore, your promotions, salaries, etc. will be
determined by something we can measure, your research and
publication records." One is reminded of the story of the
young boy searching for his lost coin one night under a
street light. A passerby stopped to give assistance and
questioned the lad about where the coin was lost. The boy
pointed into a dark alley some distance away. The surprised
Samaritan asked the boy why he was looking here under the
street light when in fact the coin was lost there in the
alley. The young philosopher replied to the obviously less
intelligent adult, "The light is better here."

Many approaches have been used in attempting to measure
teaching, but by far the most widely used method is that of
student evaluations (Centra, 1979). Student evaluation of
faculty performance is not always popular on college
campuses, but it is a reality and it is, as the research
tends to show (Centra, 1979), the most reliable of the
methods available and, perhaps, possesses the fewest severe
side effects.

In developing a program of student evaluation of
faculty performance, one of the obvious problems is that of
choosing an instrument. As Milton et al. (1978) point out,
one should not casually produce a homemade device and
quickly introduce it for the purpose of making important
decisions. There are many standardized instruments
available, so why reinvent the wheel? However, the
administrator must decide which one of the many available
instruments is best for his/her purpose. The present report
relates some data hopefully relevant to this point.

Method

The present research compares three of the most widely
used (Centra, 1979; L. M. Aleamoni, personal
communication, May, 1983) instruments for student evaluation
of faculty: The Student Instructional Report (SIR) produced
by Educational Testing Service, The College Instructional
Evaluation Questionnaire (CIEQ) produced by the University
of Arizona, and The Instructional Development and
Effectiveness Assessment (IDEA) produced by Kansas State
University. These three instruments were administered to
select classes during a summer session at Sam Houston State
University.

Subjects.

There were 16 classes involved in the administration of
the three instruments, a total of 426 students. The classes
were selected in an attempt to represent a cross-section of
the university population. The following criteria were
used:

1. At least one class from the following levels:
   (a) Lower level required class of 30 to 50.
   (b) Lower level lecture class, not required, of 30 to 50.
   (c) Lower level lecture/lab class of 20 to 50.
   (d) Upper level lecture class of 20 to 40.
   (e) Upper level non-lecture course of 10 to 30.
   (f) Master's level class of 7 to 20.
   (g) Doctoral level class of 5 to 10.

2. At least one course from each college within the
   university's organizational structure.

3. No class of fewer than 5.

4. Class taught by full-time regular faculty.

Instruments.

SIR. The SIR instrument is a 39-item questionnaire.
In addition it contains a space for five items selected by
the local faculty member. The items cover a wide variety of
topics: course organization and planning, faculty/student
interaction, communications, course difficulty and workload,
textbooks and readings, tests and exams, overall
evaluations, student and course descriptive items, local
option items, and miscellaneous. Feedback is given from
Educational Testing Service by percent responding to each
item, item means, percentile equivalent of means, and scores
(percentiles and factor scores) on six factors (Table
1). These factor scores are based on previously identified
factors from factor analysis (Centra, 1973). Comparative
data for more than 30 academic disciplines, various class
sizes, school size, level of class, and type of class are
available separately.

CIEQ. The CIEQ instrument contains 21 standard items
plus seven open-ended questions on the reverse side of the
form. The items cover a wide variety of topics, as can be
seen in Table 1. Feedback is given in the proportion and
frequency of responses, means, and standard deviations on
each of the 21 items. Scores are also given for five scale
scores plus a total. These scales are based on previously
identified factors from factor analytical research
(Aleamoni, 1978). In addition there are several scores
given for the scales and total comparing the individual
faculty member's evaluation with other faculty evaluations
from similar courses.

IDEA. The IDEA form contains 39 numbered statements
and seven lettered statements. The instructor also has the
option of including up to five additional items of his/her
choosing. The statements are grouped into sections labeled
the instructor, progress on (the student is asked to compare
his/her progress in this course to other courses being
taken), the course, self-rating, and the respondent
characteristics. Feedback is presented to the instructor by
subject matter mastery, development of general skills, and
personal development, i.e., how much progress the students
have made in these areas. Other sections present
descriptions of the course, the student's self-rating, the
teaching methods used, a section for additional questions,
and finally a diagnostic summary. Scores are presented as
frequencies of the five-point rating scale, means, a
difference score, and a translation (high, high average,
average, low average, low). The diagnostic summary is
presented with scores on teaching methods most needing
attention. The factor scores are presented in the summary
profile and contain the seven categories presented in Table
1.

Procedure. 

Each student completed all evaluation forms (SIR, CIEQ,
IDEA, and the brief preference questionnaire) in the same
class period. Each class was given a predetermined sequence
of administration for the three instruments. These
sequences of administration were randomly selected by the
experimenter to control for order of administration.

Results and Discussion 

Questionnaire Correlations. 

Using the IDEA overall evaluation score, the CIEQ total
score, and a composite score derived from items 38 and 39
from SIR, Spearman rank-order correlation coefficients were
calculated. The results (see Table 2) were across the board
extremely high. If one is using the test for evaluation and
an overall score is needed, all three tests seem to do the
job equally well or poorly. If one desires information for
faculty development purposes, then the SIR seems to this
writer more appropriate.
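
For readers who wish to run the same kind of analysis on their own ratings, the following is a minimal sketch of a Spearman rank-order correlation, computed as the Pearson correlation of average ranks so that tied scores are handled. The function names are assumptions of this sketch, and the class scores at the bottom are hypothetical illustration values, not data from this study.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (sx * sy)

# Hypothetical overall scores for five classes on two instruments.
instrument_a = [4.1, 3.6, 4.5, 3.9, 3.2]
instrument_b = [3.9, 3.5, 4.4, 4.0, 3.0]
rho = spearman_rho(instrument_a, instrument_b)  # approximately 0.9
```

Because only the rank order of the classes enters the calculation, two instruments scored on different scales can be compared directly in this way.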

The correlations between the various factor scores on
the tests were also of interest. The Spearman rank-order
correlations between the seven factor scores on the IDEA
ranged from .56 to .91 (see Table 3). If the exams factor
score is eliminated, the lowest correlation is .72. One
might argue that the factor scores add little information or
that good teachers are good on all accounts and vice versa.
On the six factor scores of the CIEQ the same can be said
(see Table 4). The lowest correlation was .81. The SIR, on
the other hand, had factor scores that correlated at much
lower levels with one another (see Table 5), the lowest
correlation being .16 and the highest .90; eliminating the
composite score, the range was .16 to .70. The SIR
text/reading score was not reported in a majority of the
cases. This score is a composite of items 32 and 33, with
item 33 asking the students to rate the readings. Many of the
classes did not have readings other than the text;
therefore, the score could not be calculated. It would
appear that the separate SIR measures are measuring
different elements.
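
The ranges quoted for the SIR can be checked directly against the lower triangle of Table 5. The sketch below is only illustrative: the short scale labels and the dictionary layout are assumptions of this sketch, and the text/readings scale is left out because it was not reported.

```python
# Lower-triangle SIR subscore correlations as printed in Table 5.
# Keys are (row scale, column scale); the short labels are this sketch's own.
sir_correlations = {
    ("interaction", "organization"): 0.58,
    ("communication", "organization"): 0.29,
    ("communication", "interaction"): 0.70,
    ("difficulty", "organization"): 0.33,
    ("difficulty", "interaction"): 0.58,
    ("difficulty", "communication"): 0.39,
    ("tests", "organization"): 0.65,
    ("tests", "interaction"): 0.59,
    ("tests", "communication"): 0.26,
    ("tests", "difficulty"): 0.16,
    ("overall", "organization"): 0.54,
    ("overall", "interaction"): 0.90,
    ("overall", "communication"): 0.83,
    ("overall", "difficulty"): 0.43,
    ("overall", "tests"): 0.69,
}

def correlation_range(correlations, exclude=()):
    """Return (lowest, highest), skipping pairs that involve an excluded scale."""
    kept = [r for pair, r in correlations.items()
            if not any(scale in exclude for scale in pair)]
    return min(kept), max(kept)

all_scales = correlation_range(sir_correlations)                    # (0.16, 0.9)
without_overall = correlation_range(sir_correlations, ["overall"])  # (0.16, 0.7)
```

Dropping the overall score removes the three highest entries, which is why the upper bound falls from .90 to .70 while the lower bound stays at .16.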

Comparing the factor scores of the three instruments is
somewhat difficult since each instrument has a different set
of factor scores. Review of the evaluation literature by
Centra (1972) revealed three common factors in most
instruments: organization or structure, teaching skills or
communication, and student rapport or empathy. There was a
possible fourth factor, student effort or involvement. It
is difficult to identify these factors in the three
instruments being reported here. Nevertheless, comparison
of the CIEQ factor scores with factor scores of the IDEA
revealed a remarkable degree of similarity (see Table 6).
The correlations ranged from .75 between CIEQ method and
IDEA overall to .95 between CIEQ total and IDEA creating
enthusiasm. The correlations between the SIR and the other
two questionnaires were much more varied (see Tables 7 and
8), SIR/CIEQ ranging from .35 to .94 and SIR/IDEA from .35
to .92. It seems rather clear from these results that the
SIR comes closer to differentiating characteristics than the
other two.

The high degree of correlation among these instruments
suggests the existence of a single general factor underlying
student ratings of instruction.

Faculty/Student Preferences. 

The 16 faculty and 426 students were administered a
brief questionnaire. Of the 16 faculty only 10 returned
their forms, for a return rate of 62.5%. Of the 426
students, 332 returned the questionnaire, for a return rate
of 77.9%. In the questionnaire the subjects were asked
which of the instruments they preferred, which they
considered least difficult for the students to complete and
understand, which provided the soundest judgement of faculty
efforts, and finally, to the faculty only, which provided
the best information for faculty evaluation and development.
The SIR was rated superior and the IDEA was rated lowest (see
Table 9). These ratings suggest a greater face validity for
the SIR and that faculty and students see it as more useful.
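
The return rates and the percentages in Table 9 are simple proportions of respondents. As a minimal sketch (the function name is an assumption of this sketch), the figures quoted above and the student "prefer" row of Table 9 can be recomputed as follows.

```python
def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)

# Return rates reported in the text.
faculty_return = percent(10, 16)    # 62.5
student_return = percent(332, 426)  # 77.9

# Student answers to the "prefer" question (Table 9), as counts
# under the table's four column headings.
student_preference = {"CIEQ": 76, "IDEA": 37, "SIR": 189, "Difference": 30}
respondents = sum(student_preference.values())  # 332, matching the returns
shares = {name: percent(n, respondents) for name, n in student_preference.items()}
```

The recomputed shares can differ from the printed percentages by about a tenth of a point, which is the size of difference one would expect from rounding.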

Content Evaluation. 

On all three questionnaires there seemed to be a
balanced attempt to equalize the items directly related to
the instructor with the items directly related to the course
content. Questions on examinations were found in the SIR
and IDEA, but on the CIEQ the only exam question was an
open-ended item on the back of the questionnaire. Specific
questions on the textbook, readings, and laboratories were
found on the SIR but only implied in the IDEA and only in
the open-ended items of the CIEQ.

The CIEQ questions were simpler and shorter. The
average number of words per question was 7.81 for the
CIEQ, 10.21 for the SIR, and 9.83 for the IDEA.
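
An average question length of this kind can be computed mechanically. The sketch below assumes that words are whitespace-separated tokens, which is an assumption of this sketch rather than the counting rule used in the study; the two sample items are hypothetical and are not taken from any of the three forms.

```python
def mean_words_per_item(items):
    """Average number of whitespace-separated words across questionnaire items."""
    if not items:
        raise ValueError("need at least one item")
    return sum(len(item.split()) for item in items) / len(items)

# Hypothetical items, for illustration only.
sample_items = [
    "The instructor was well prepared for each class.",
    "Course objectives were clear.",
]
average_length = mean_words_per_item(sample_items)  # (8 + 4) / 2 = 6.0
```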

Each of the three contained items that greatly
overlapped and each had items that were unique to it;
however, the IDEA contained some of the more interesting
unique items, e.g., "I have given thoughtful consideration
to the questions on this form," "How well did the questions
on this form permit you to describe your impressions of the
instructor and course?" and "For how many courses have you
filled out this form during the present term?"

One other consideration of content regarded the
question of student acquiescent tendencies, a well-known
consideration in questionnaire construction. A good
questionnaire should have a balance of yes/no responses so
that yes and no both will be used to indicate a positive
evaluation of the person or concept under consideration.
On the SIR, the respondent was required to indicate no in
only two cases in order to indicate that the instructor did
a good job; on the IDEA, four times; and on the CIEQ, 11
times. The CIEQ was the only instrument that seriously
attempted to balance the yes/no responses.
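
One way to put these counts on a common footing is to express them as a share of each form's items. In the sketch below the counts of "no" items come from the paragraph above, but the denominators (39 SIR items, 39 numbered IDEA statements, 21 standard CIEQ items, taken from the Instruments section) are an assumption of this sketch, since the paper does not say how many items on each form are keyed in either direction; the function name is also hypothetical.

```python
def reverse_keyed_share(reverse_keyed, total_items):
    """Percentage of items on which a 'no' answer signals a favorable rating."""
    if total_items <= 0 or not 0 <= reverse_keyed <= total_items:
        raise ValueError("need total_items > 0 and 0 <= reverse_keyed <= total_items")
    return round(100 * reverse_keyed / total_items, 1)

# (count of 'no' items from the text, assumed number of rated items)
forms = {"SIR": (2, 39), "IDEA": (4, 39), "CIEQ": (11, 21)}
balance = {name: reverse_keyed_share(k, n) for name, (k, n) in forms.items()}

# A perfectly balanced form would sit at 50 percent.
most_balanced = min(balance, key=lambda name: abs(balance[name] - 50.0))
```

Under these assumed denominators only the CIEQ comes near an even split, which is consistent with the observation above.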

Conclusion 

It is difficult to say which of these questionnaires
would be best for a specific situation without examining
carefully the needs of that specific situation.
Nevertheless, one might say that the SIR appeared to measure
the differential components of teaching with more clarity,
the SIR was preferred by students and faculty over the other
two instruments, and the SIR had the greatest amount of
feedback available. On the other hand, the CIEQ was simpler
to read, it was shorter, and it had fewer categories. The
IDEA seemed the most complex of the three. IDEA had a
multitude of categories of items and feedback, but it was
designed more for the purpose of faculty development, perhaps,
than the other two.

One last point is the issue of cost; on this the SIR won
"going away." The SIR was much more expensive than the other
two instruments. The CIEQ was the cheapest.

The evaluation of teaching is not an easy task. As
Tucker (1984) says, "The art of evaluating the performance
of faculty members is not that well developed" (p. 151).
But it is an important task and one that needs increasing
effort from the research community. This report is meager
and filled with many problems, but it is an attempt to begin
the evaluation of the available instruments.

TABLE 1
COMPARISON OF EVALUATION INSTRUMENTS' SUBSCORES

SIR     Faculty/student interaction
        Communications
        Tests, exams, textbook, and readings
        Course organization and planning
        Course difficulty and workload
        Student interest
        Overall

CIEQ    General attitude
        Method (of instruction)
        Content
        Interest
        Instructor
        Total

IDEA    Outcomes
          Overall evaluation
          Would like instructor again
          Improved attitude toward field
        Method
          Involving students
          Communicating content and purpose
          Creating enthusiasm
          Preparing examinations






TABLE 2
COMPARISON OF OVERALL EVALUATIONS ON CIEQ, IDEA, SIR

          CIEQ    IDEA    SIR

CIEQ       --
IDEA      .69      --
SIR       .94     .88      --

TABLE 3
IDEA SUBSCORE CORRELATIONS

OUTCOME
  overall
  would like instructor again         .72
  improved attitude toward field      .74   .91
METHOD
  involving students                  .86   .81   .64
  communication of content & purpose  .86   .80   .77   .83
  creating enthusiasm                 .86   .90   .78   .92   .81
  preparing exams                     .70   .68   .56   .82   .71   .70



TABLE 4
CIEQ SUBSCORE CORRELATIONS

General attitude
Method                                .81
Content                               .81   .87
Interest                              .94   .82   .85
Instructor                            .82   .88   .83   .84
Total                                 .92   .93   .88   .92   .95

TABLE 5
SIR SUBSCORE CORRELATIONS

Course organization & planning
Faculty/student interaction           .58
Communication                         .29   .70
Course difficulty & workload          .33   .58   .39
Textbook & readings                   --    --    --    --
Tests & exams                         .65   .59   .26   .16   --
Overall score                         .54   .90   .83   .43   --    .69

TABLE 6
CIEQ/IDEA SUBSCORE COMPARISONS

                                      general
                                      attitude  method  content  interest  instructor  total
OUTCOME
  overall                               .88      .75      .92      .90        .82       .89
  would like instructor again           .81      .98      .81      .84        .91       .93
  improved attitude toward field        .86      .89      .80      .82        .81       .86
METHOD
  involving students                    .79      .82      .85      .87        .87       .91
  communication of content & purpose    .76      .84      .90      .78        .80       .85
  creating enthusiasm                   .87      .38      .84      .94        .90       .95
  preparing exams                       .69      .75      .77      .63        .80       .80

TABLE 7
IDEA/SIR SUBSCORE CORRELATIONS

SIR columns: (1) course organization & planning; (2) faculty/student
interaction; (3) communication; (4) course difficulty & workload;
(5) text & readings; (6) tests & exams; (7) overall score.

                                      (1)   (2)   (3)   (4)   (5)   (6)   (7)
OUTCOME
  overall                             .37   .73   .41   .43   --    .60   .88
  would like instructor again         .75   .77   .67   .52   --    .57   .82
  improved attitude toward field      .63   .57   .43   .52   --    .52   .70
METHOD
  involving students                  .45   .91   .77   .47   --    .57   .95
  communication of content & purpose  .55   .75   .50   .57   --    .62   .79
  creating enthusiasm                 .57   .79   .65   .35   --    .55   .92
  preparing exams                     .51   .92   .60   .48   --    .66   .89

TABLE 8
CIEQ/SIR SUBSCORE CORRELATIONS

                                      general
                                      attitude  method  content  interest  instructor  total

course organization & planning          .42      .69      .52      .47        .71       .59
faculty/student interaction             .63      .81      .80      .68        .38       .85
communication                           .51      .66      .53      .59        .59       .62
course difficulty & workload            .40      .58      .46      .35        .53       .56
textbook & readings                     --       --       --       --         --        --
tests & exams                           .53      .61      .75      .59        .56       .57
overall score                           .86      .84      .87      .89        .90       .94

TABLE 9
FACULTY/STUDENT OPINION QUESTIONNAIRE RESULTS

Entries are n, with percent in parentheses.

Questions                       Group       CIEQ         IDEA         SIR          Difference

prefer                          faculty       3 (30.0)     0 (0.0)      6 (60.0)     1 (10.0)
                                students     76 (22.9)    37 (11.2)   189 (57.0)    30 (9.1)

least difficult for
student to:
  complete                      faculty       1 (10.0)     1 (10.0)     7 (70.0)     1 (10.0)
                                students     71 (21.4)    16 (4.9)    196 (59.1)    49 (14.8)
  understand                    faculty       1 (10.0)     0 (0.0)      6 (60.0)     3 (30.0)
                                students     66 (19.9)    16 (4.9)    191 (57.6)    59 (17.8)

allowed the student
to provide soundest
judgement                       faculty       3 (30.0)     0 (0.0)      6 (60.0)     1 (10.0)
                                students     95 (28.7)    41 (12.4)   154 (46.4)    42 (12.7)

provides information
necessary for faculty:
  evaluation                    faculty       3 (30.0)     0 (0.0)      5 (50.0)     2 (20.0)
  development                   faculty       2 (20.0)     1 (10.0)     4 (40.0)     3 (30.0)